Conversation

sureshanaparti
Contributor

@sureshanaparti sureshanaparti commented Sep 12, 2025

Description

This PR improves volume migration so that it bypasses secondary storage when a volume can be copied directly between pools.

It also includes the following improvements:

  • Updates the pool type of the volume wherever required.
  • Updates the disk offering(s) of volume(s) after a VM is migrated with its volumes, when the pool type changes (shared or local) and a suitable disk offering is available. Migrating a VM with volume(s) bypasses the service and disk offerings of the volumes because the target pools are specified explicitly; this is the current expected behavior. However, there seems to be a problem when the pool type changes between shared and local.
  • Updates the error message shown when a volume is migrated from local to shared storage, or from shared to local, without a matching disk offering.
  • Updates endpoint selection to consider host scope first when copying between primary storages.
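
The offering check described in the second and third bullets can be pictured with a small sketch. It is written in Python for brevity rather than in CloudStack's Java, and every name in it (`DiskOffering`, `use_local_storage`, `pick_offering_for_pool`) is hypothetical, chosen only to illustrate the idea: keep the current offering if it still matches the target pool type, otherwise look for one that does, and fail with a clear message if none exists.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DiskOffering:
    # Hypothetical model; a real offering carries many more fields.
    name: str
    use_local_storage: bool


def pick_offering_for_pool(
    current: DiskOffering,
    target_pool_is_local: bool,
    candidates: Sequence[DiskOffering],
) -> DiskOffering:
    """Return an offering whose storage type matches the target pool."""
    # No change needed when the current offering already matches.
    if current.use_local_storage == target_pool_is_local:
        return current
    # Otherwise take the first candidate of the right storage type.
    for offering in candidates:
        if offering.use_local_storage == target_pool_is_local:
            return offering
    kind = "local" if target_pool_is_local else "shared"
    raise ValueError(
        f"no disk offering for {kind} storage is available for this migration"
    )
```

In this sketch the caller decides what counts as a candidate; the PR text only says that a "suitable" offering is used when one is available.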

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Tested (offline) volume migration between:

  • NFS pools in the cluster
  • Local Storage to NFS and vice versa
  • Ceph to NFS / Local Storage

How did you try to break this feature and the system with this change?

@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov bot commented Sep 12, 2025

Codecov Report

❌ Patch coverage is 26.50602% with 122 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.53%. Comparing base (0108ffd) to head (a0a3038).
⚠️ Report is 65 commits behind head on main.

Files with missing lines | Patch % | Lines
...ud/hypervisor/kvm/storage/KVMStorageProcessor.java | 0.00% | 26 Missing ⚠️
...tack/storage/motion/AncientDataMotionStrategy.java | 52.83% | 11 Missing and 14 partials ⚠️
...torage/motion/StorageSystemDataMotionStrategy.java | 0.00% | 23 Missing ⚠️
...tack/storage/endpoint/DefaultEndPointSelector.java | 0.00% | 7 Missing ⚠️
...ava/com/cloud/storage/dao/DiskOfferingDaoImpl.java | 40.00% | 6 Missing ⚠️
...stack/engine/orchestration/VolumeOrchestrator.java | 33.33% | 4 Missing ⚠️
...n/java/com/cloud/storage/VolumeApiServiceImpl.java | 60.00% | 3 Missing and 1 partial ⚠️
...ack/engine/subsystem/api/storage/ClusterScope.java | 0.00% | 3 Missing ⚠️
...dstack/engine/subsystem/api/storage/HostScope.java | 40.00% | 3 Missing ⚠️
...dstack/engine/subsystem/api/storage/ZoneScope.java | 0.00% | 3 Missing ⚠️
... and 14 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11625      +/-   ##
============================================
+ Coverage     17.39%   17.53%   +0.14%     
- Complexity    15283    15473     +190     
============================================
  Files          5889     5897       +8     
  Lines        526183   527356    +1173     
  Branches      64242    64407     +165     
============================================
+ Hits          91541    92492     +951     
- Misses       424298   424456     +158     
- Partials      10344    10408      +64     
Flag | Coverage Δ
uitests | 3.60% <ø> (-0.02%) ⬇️
unittests | 18.59% <26.50%> (+0.15%) ⬆️
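
The percentages in this report follow from the line counts in the coverage diff. A quick check in Python, using only figures quoted in the report; the 166-line patch size is an inference (44 covered plus 122 missing), not a number Codecov states, and the truncation behaviour is likewise an assumption drawn from the displayed values:

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of lines hit."""
    return 100.0 * hits / lines


def truncate(value: float, places: int) -> float:
    """Drop digits beyond `places` decimals (the report appears to truncate, not round)."""
    factor = 10 ** places
    return math.floor(value * factor) / factor


# Hits and Lines rows of the coverage diff: base (main) and head (#11625).
base = coverage_pct(91541, 526183)
head = coverage_pct(92492, 527356)

# Assumption: 122 missing lines at 26.50602% implies a 166-line patch with 44 covered.
patch = coverage_pct(44, 166)

# Sanity check: hits + misses + partials should equal the total line count.
lines_add_up = (92492 + 424456 + 10408) == 527356

print(truncate(base, 2), truncate(head, 2), round(head - base, 2), truncate(patch, 5))
```

The same identity (hits + misses + partials = lines) holds for the base column, which is a useful way to confirm that a scraped coverage table has not lost a row.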

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 14980

@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances the volume migration functionality to bypass secondary storage when direct copy between pools is possible, improving performance for supported pool types.

Key changes:

  • Added poolType parameter to volume import/update operations to track storage pool types
  • Implemented logic to determine when secondary storage can be bypassed during volume migration
  • Updated volume database records to include pool type information when pools change

Reviewed Changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 1 comment.

File | Description
AncientDataMotionStrategy.java | Adds bypass logic for supported pool types (NFS, Filesystem) with scope validation
VolumeOrchestrator.java | Updates import/update methods to include poolType parameter
Multiple volume management files | Adds setPoolType() calls when updating volume pool assignments
KVMStorageProcessor.java | Fixes path handling logic for volume copying between pools
VMSnapshotManagerImpl.java | Corrects spelling error in exception messages
Scope classes | Adds toString() methods for better debugging

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 14981

@weizhouapache
Member

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14312)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 53901 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11625-t14312-kvm-ol8.zip
Smoke tests completed. 146 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_03_deploy_and_scale_kubernetes_cluster | Failure | 1.16 | test_kubernetes_clusters.py
test_04_autoscale_kubernetes_cluster | Failure | 28.84 | test_kubernetes_clusters.py

Member

@weizhouapache weizhouapache left a comment

overall lgtm

not tested yet

@sureshanaparti sureshanaparti force-pushed the migrate-volume-improvements-to-bypass-sec-store branch from d646d67 to 89813ff Compare September 16, 2025 08:13
@sureshanaparti sureshanaparti marked this pull request as ready for review September 16, 2025 08:13
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 15073

@sureshanaparti
Contributor Author

@blueorangutan test

@blueorangutan

@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14425)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 53661 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11625-t14425-kvm-ol8.zip
Smoke tests completed. 146 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_03_deploy_and_scale_kubernetes_cluster | Failure | 1.18 | test_kubernetes_clusters.py
test_04_autoscale_kubernetes_cluster | Failure | 34.99 | test_kubernetes_clusters.py

…volumes when change in pool type (shared or local)

Currently, migrating a VM with its volume(s) bypasses the service and disk offerings of the volumes, because the target pools for migration are specified explicitly.
An offering change is required when the pool type (shared or local) changes, mainly:
- when a volume on a shared pool is migrated to a local pool
- when a volume on a local pool is migrated to a shared pool
…offering type mismatches (both are not shared/local)
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 15177

@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14452)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 53643 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11625-t14452-kvm-ol8.zip
Smoke tests completed. 147 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Contributor

@kiranchavala kiranchavala left a comment

LGTM, tested the following scenarios and migrations bypassed the secondary storage

Screenshot 2025-10-06 at 11 11 57 AM
Test Case | Execution Result
Migrate volumes from NFS storage to another NFS storage in the same cluster | Pass
Migrate a volume from local storage to NFS storage | Pass
Migrate a volume from NFS storage to local storage | Pass
Migrate a volume from Ceph storage to NFS storage | Pass
Migrate a volume from NFS storage to Ceph storage | Pass
Migrate a volume from local storage to Ceph storage | Pass
Migrate a volume from Ceph storage to local storage | Pass
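
Enumerating the source and destination combinations makes it easy to see which ones the table above does not cover. The short Python sketch below does that; the three storage labels and the `TESTED` set are transcribed from the table, and the choice to treat every ordered pair as a candidate scenario is an assumption made for illustration.

```python
from itertools import product

STORAGE_TYPES = ("NFS", "local", "Ceph")

# (source, destination) pairs reported as passing in the table above.
TESTED = {
    ("NFS", "NFS"),
    ("local", "NFS"),
    ("NFS", "local"),
    ("Ceph", "NFS"),
    ("NFS", "Ceph"),
    ("local", "Ceph"),
    ("Ceph", "local"),
}

# Every ordered pair of storage types that does not appear in the table.
untested = [
    pair for pair in product(STORAGE_TYPES, repeat=2) if pair not in TESTED
]

for source, destination in untested:
    print(f"not covered: {source} -> {destination}")
```

The pairs left over are local to local and Ceph to Ceph.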

@weizhouapache
Member

LGTM, tested the following scenarios and migrations bypassed the secondary storage

perfect testing @kiranchavala

just a question, have you tested online/offline ceph to ceph migration ?

… offerings with tags mismatch with storage tags
@sureshanaparti
Contributor Author

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@sureshanaparti
Contributor Author

perfect testing @kiranchavala

just a question, have you tested online/offline ceph to ceph migration ?

@weizhouapache I think, it's offline ceph to ceph migration. @kiranchavala can you confirm.

@kiranchavala
Contributor

perfect testing @kiranchavala
just a question, have you tested online/offline ceph to ceph migration ?

@weizhouapache I think, it's offline ceph to ceph migration. @kiranchavala can you confirm.

No, I didn't test ceph to ceph offline/online migration

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 15316

Contributor

@harikrishna-patnala harikrishna-patnala left a comment

LGTM

@harikrishna-patnala harikrishna-patnala merged commit f67b738 into apache:main Oct 9, 2025
27 of 28 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Apache CloudStack 4.22.0 Oct 9, 2025
@harikrishna-patnala harikrishna-patnala deleted the migrate-volume-improvements-to-bypass-sec-store branch October 9, 2025 10:30
@shwstppr
Contributor

Simulator CI seems to have been failing due to this PR's change. It was failing in this PR and is now failing in main @harikrishna-patnala @sureshanaparti

Run echo -e "Simulator CI Test Results: (only failures listed)\n"
Simulator CI Test Results: (only failures listed)

+-----------------------------+--------------------+--------+-----------------+
|            Test             |       Result       |  Time  |    Test file    |
+=============================+====================+========+=================+
| test_09_stop_vm_migrate_vol | builtins.Exception | 24.431 | test_stopped_vm |
+-----------------------------+--------------------+--------+-----------------+
Error: Process completed with exit code 1.

dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Oct 17, 2025
…lume between pools is allowed directly (apache#11625)

* Migrate volume improvements, to bypass secondary storage when copy volume between pools is allowed directly

* Bypass secondary storage for copy volume between zone-wide pools and
- local storage on host in the same zone
- cluster-wide pools in the same zone

* Bypass secondary storage for volumes on ceph/rbd pool when the scope permits

* Fix dest disk format while migrating volume from ceph/rbd to nfs, and some code improvements

* unit tests

* Update suitable disk offering(s) for volume(s) after migrate VM with volumes when change in pool type (shared or local)

Currently, migrating a VM with its volume(s) bypasses the service and disk offerings of the volumes, because the target pools for migration are specified explicitly.
An offering change is required when the pool type (shared or local) changes, mainly:
- when a volume on a shared pool is migrated to a local pool
- when a volume on a local pool is migrated to a shared pool

* Update with proper message while migrate volume when target pool and offering type mismatches (both are not shared/local)

* Consider host scope first during endpoint selection while copying between primary storages

* Update disk offering count (for listDiskOfferings api) while removing offerings with tags mismatch with storage tags
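
The bypass conditions listed in this commit message can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the logic in `AncientDataMotionStrategy.java`: the `Pool` and `Scope` types, the `DIRECT_COPY_TYPES` set, and the function name are all hypothetical, and the same-cluster rule for pools that are not zone-wide is a guess based on the "NFS pools in the cluster" test case rather than something the PR states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    HOST = "host"
    CLUSTER = "cluster"
    ZONE = "zone"


@dataclass(frozen=True)
class Pool:
    pool_type: str  # illustrative labels, e.g. "NetworkFilesystem", "Filesystem", "RBD"
    scope: Scope
    zone_id: int
    cluster_id: Optional[int] = None


# Assumption: pool types mentioned in the PR as eligible for a direct copy.
DIRECT_COPY_TYPES = {"NetworkFilesystem", "Filesystem", "RBD"}


def can_bypass_secondary_storage(src: Pool, dest: Pool) -> bool:
    """Decide whether a volume could be copied pool-to-pool directly."""
    if src.pool_type not in DIRECT_COPY_TYPES:
        return False
    if dest.pool_type not in DIRECT_COPY_TYPES:
        return False
    # Both pools must be in the same zone.
    if src.zone_id != dest.zone_id:
        return False
    # A zone-wide pool can reach host-local or cluster-wide pools in its zone.
    if Scope.ZONE in (src.scope, dest.scope):
        return True
    # Assumption: otherwise both pools must belong to the same cluster.
    return src.cluster_id is not None and src.cluster_id == dest.cluster_id
```

When the function returns False in this sketch, the copy would fall back to staging the volume on secondary storage, which is the behavior the PR aims to avoid where possible.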
Projects

Status: Done

8 participants